# DPO optimization
## Turkish-Gemma-9b-v0.1
Maintainer: ytu-ce-cosmos · Large Language Model · Safetensors

Turkish-Gemma-9b-v0.1 is a Turkish text generation model built on Gemma-2-9b and optimized through continued pretraining, supervised fine-tuning (SFT), direct preference optimization (DPO), and model merging.
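The card above mentions a DPO stage on top of SFT. For orientation, here is a minimal sketch of what such a stage can look like with Hugging Face TRL's `DPOTrainer`; the base checkpoint, preference dataset, and hyperparameters are illustrative placeholders, not the recipe actually used for Turkish-Gemma-9b-v0.1.

```python
# Illustrative DPO stage with Hugging Face TRL. The base checkpoint,
# preference dataset, and hyperparameters below are placeholders, not
# the actual recipe behind Turkish-Gemma-9b-v0.1.
from datasets import load_dataset
from transformers import AutoModelForCausalLM, AutoTokenizer
from trl import DPOConfig, DPOTrainer

base = "google/gemma-2-9b"  # base model family named in the card
model = AutoModelForCausalLM.from_pretrained(base)
tokenizer = AutoTokenizer.from_pretrained(base)

# Preference data with "prompt", "chosen", and "rejected" columns.
train_dataset = load_dataset("trl-lib/ultrafeedback_binarized", split="train")

args = DPOConfig(
    output_dir="gemma2-9b-dpo",
    beta=0.1,                        # weight of the KL penalty toward the reference model
    per_device_train_batch_size=1,
    gradient_accumulation_steps=8,
    num_train_epochs=1,
)

trainer = DPOTrainer(
    model=model,
    args=args,
    train_dataset=train_dataset,
    processing_class=tokenizer,      # older TRL releases use tokenizer=
)
trainer.train()
```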
## Ablation 141 a128.dpo.armorm.rp Shisa V2 Llama 3.1 8b
Maintainer: shisa-ai · Large Language Model · Transformers

A language model fine-tuned with the DPO method, suitable for text generation tasks.
## Bytedance-Research UI-TARS-7B-DPO GGUF
Maintainer: DevQuasar · Image-to-Text

A quantized (GGUF) build of ByteDance Research's UI-TARS-7B-DPO, published to make the model more broadly accessible.
## SummLlama3-70B
Maintainer: DISLab · Large Language Model

SummLlama3-70B is a text summarization model initialized from Llama3-70B-Instruct and optimized with DPO training on large-scale summarization feedback, excelling in faithfulness, completeness, and conciseness.
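Since SummLlama3-70B is an instruction-style summarizer initialized from Llama3-70B-Instruct, a plain chat-template call is enough to try it. The sketch below assumes the Hub repo id `DISLab/SummLlama3-70B` and an ad-hoc prompt wording; adjust both to the actual model card.

```python
# Usage sketch for a chat-style summarization model. The repo id and
# prompt wording are assumptions, not taken from the model card.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "DISLab/SummLlama3-70B"   # assumed Hub repo id
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,
    device_map="auto",               # a 70B model needs several GPUs or offloading
)

document = "..."                     # text to summarize
messages = [{"role": "user", "content": f"Please summarize the following text:\n\n{document}"}]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output = model.generate(input_ids, max_new_tokens=256)
print(tokenizer.decode(output[0][input_ids.shape[-1]:], skip_special_tokens=True))
```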
## Rhea-72b-v0.5
Maintainer: davidkim205 · Large Language Model · Transformers · English · Apache-2.0

Rhea-72b-v0.5 is a large language model fine-tuned from Smaug-72B-v0.1; it ranked first on the HuggingFace Open LLM Leaderboard.
## Nous Hermes 2 Mistral 7B DPO AWQ
Maintainer: solidrust · Large Language Model · Transformers · English · Apache-2.0

An AWQ-quantized build of Nous Hermes 2 Mistral 7B DPO, the next-generation flagship 7B Hermes model built on Mistral 7B and optimized with DPO, demonstrating strong performance across multiple benchmarks.
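AWQ checkpoints like this one can typically be loaded straight through transformers when the `autoawq` package is installed. The repo id below is an assumption inferred from the listing; verify it on the Hub before use.

```python
# Loading an AWQ-quantized checkpoint through transformers (requires the
# autoawq package). The repo id is an assumption inferred from the listing.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "solidrust/Nous-Hermes-2-Mistral-7B-DPO-AWQ"  # assumed repo id
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

prompt = "Explain direct preference optimization in one paragraph."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
output = model.generate(**inputs, max_new_tokens=200)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```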